Abstract


This project has shown conclusively that mathematical modeling of complex data
structures arising from multimodal analysis of multimedia texts has potential for
describing, identifying, interpreting and forecasting socio-cultural patterns, trends and
instabilities through the identification of semantic patterns which are specific to
different people, texts and situational contexts. The complex data structures are derived
from systemic functional theory (SFT), where linguistic, visual and audio resources are
conceptualized as integrated systems of meaning. The approach moves beyond text
analytics, where concepts are derived from lexical choices, to a holistic approach that
takes into account the meaning arising from the interaction of language, images and
audio resources. The approach has significant implications for discourse analysis, data
mining, search and retrieval, and visual analytics, which currently lack theoretical
frameworks to account for the interaction of language with other resources in texts.

1.  Introduction 

The aim of the project is to apply methods and principles of dynamical systems theory
(DST) to base data derived from systemic functional theory (SFT) analysis of text and
multimedia resources, in order to identify and track evolving semantic patterns, in
particular those related to stability and instability. The goal of the project is to
develop theory and algorithms, and to demonstrate their validity and potential with
case studies involving multimodal analysis of linguistic, visual and audio choices in
multimedia texts.

Detailed SFT analysis of six case studies provided test-case base data for DST analysis
in the first phase of the project. The case studies involved online discourses about the
global financial crisis and climate change, with a focus on the events occurring around
the time of the United Nations Copenhagen Climate Change Summit 2009 (COP 15) in
Copenhagen, Denmark on 7-18 December 2009. In the second and third phases of the
project, the focus shifted to written texts and televised interviews about the Climatic
Research Unit email controversy, which involved the hacking of a server at the Climatic
Research Unit at the University of East Anglia on 20 November 2009. The controversy
attracted extensive media coverage in which questions were raised about scientists’
manipulation of climate data, as illustrated in the written texts and video segments
under analysis.

Software tools for manipulation, analysis, and visualization of the SFT base data for
text and video analysis were developed in order to map the ‘landscape’ on which the
dynamics of the texts play out. This involved visualizing and understanding the
distribution of information of high dimensionality. Standard mathematical methods of
mapping were applied to the SFT base data, such as principal component analysis,
locally linear embedding, recurrence analysis, and clustering. These preliminary
experiments determined both the identifying features of the texts and which existing
mapping methods are most useful, before techniques for capturing the dynamics of
time-stamped multimodal SFT data were developed in the final phase of the project.

The importance of the research lies in the development of theoretical approaches and
mathematical techniques which take into account the semantic interaction of language,
images and audio resources in multimedia texts. At present, data analysis techniques
tend to treat language, images and audio in isolation. In this project, these resources
are considered as inter-related semantic systems which work together to create meaning
in multimedia texts, which in turn function inter-textually (i.e. with other texts) to
create trends and potential instabilities in society and culture.

2.  Experiment 

2.1  Systemic  Functional  Theory  (SFT)  for  Multimodal  Analysis 

In Systemic Functional Theory (SFT), language and other multimodal resources (i.e.
visual, auditory, kinesthetic and spatial resources) are conceptualized as inter-related
semantic systems which realize four metafunctions (e.g. Halliday 1978; Halliday &
Matthiessen 2004; Kress & van Leeuwen 2006; Martin 1992; O’Toole 2011). The four
metafunctions are concerned with (a) experiential meaning: to construct our ideas
about the world; (b) logical meaning: to establish logical relations in that world; (c)
interpersonal meaning: to enact social relations and create a stance towards the ideas
which are expressed; and (d) textual meaning: to organize the message. Experiential
and logical meanings are grouped under ‘ideational meaning’, which is our ideas about
the world.


Choices from the various systems for language, image and audio resources work
together in multimedia texts to engage and orientate readers to particular views of the
world. SFT provides a comprehensive conceptual framework for analyzing
informational content (configurations of agents, participants, processes and
circumstances), the social relations which are established (power, status and emotion),
the orientation to the ideas which are presented (modality and truth value), and the
ways in which the choices are organized to achieve specific purposes (e.g. points of
departure, given and new information) [1], [2]. SFT provided the base data for
mathematical analysis in the project.

2.2  Software  Tools 

2.2.1  Systemics  Software 

The  main  tool  for  creating  the  SFT  base  data  in  the  first  two  phases  of  the  project  was 
the  Systemics  software,  developed  by  Kay  O’Halloran  and  Kevin  Judd  in  1999-2002 
for  research  and  teaching  SFT.  The  original  Systemics  software  provided  a 
cross-platform  Graphical  User  Interface  (GUI)  application  for  SFT  annotation  of  text 
at  the  rank  of  word  group,  clause,  clause  complex,  and  discourse.  These  annotations 
are  stored  in  a  database.  The  software  provided  basic  search  functions  based  on  tag 
count  frequencies. 

The  Systemics  software  was  extensively  revised  and  extended  for  this  project  by 
adding  new  annotation  features,  more  sophisticated  search  features,  and  scientific 
visualization  techniques.  The  new  annotation  features  allow  better  analysis  of 
embedded  clause  structures,  discourse  chains  and  lexical  items.  The  new  search 
features  in  Systemics  include  word-tag  concordances,  complex  pattern-matching,  and 
complex  logical  relations  of  tags  across  systems  and  different  databases.  The  new 
visualization  features  in  Systemics  combined  mathematical  techniques  for  feature 
extraction, correlation analysis and cluster analysis. The GUIs in Systemics for clause,
clause complex and discourse annotations are displayed in Figures 1(a)-(c).

Figure 1(a) SFT Clause Annotation

Figure 1(b) SFT Clause Complex Annotation

Figure 1(c) SFT Discourse Annotation

In the third phase of the project, the multimodal analysis software developed in the
Multimodal Analysis Lab, Interactive & Digital Media Institute (IDMI) at the National
University of Singapore permitted linguistic analysis to be integrated with visual and
audio analysis to generate time-stamped SFT multimodal data for video texts [3].

2.2.2  Multimodal  Analysis  Software 

The complexity of multimodal analysis, involving language, image and audio
resources, requires a range of tools for the annotation, analysis, search and retrieval of
semantic patterns in unified but complex semiotic acts; for example, the interaction of
language, intonation, gesture, gaze, and camera angle in videos (O'Halloran, Tan,
Smith & Podlasov 2011; Smith, Tan, Podlasov & O'Halloran 2011). The multimodal
analysis software is organized into three components to fulfill these requirements: sets
of media files, SFT systems used in the annotation, and the annotation units with
time-stamps and spatial co-ordinates. The analyst imports the media file and uses a
pre-defined set of annotation systems and/or their own set of descriptors and free text
to annotate the media by creating nodes in strips with pre-assigned systems for
time-stamped analysis and overlays for spatial analysis. The analyst selects the
required system choice from the menu of available options and/or inserts free text. The
selected option and/or text are stored in a database for later retrieval and export for
mathematical analysis.

The GUIs and the assorted tools and facilities (A, B, C etc.) in the multimodal analysis
software for annotating video and sound, time-stamping text and adding overlays are
displayed in Figures 2(a)-(c).


Figure  2(a)  Sound  and  Video  Annotation  GUI 

(A) Filmstrip and waveform area; (B) Player window; (C) Systems Choice window;
(D) Playback controls; (E) General controls; (F) Annotation strip area; (G) Strip
organization view

Figure 2(b) Text Time Stamping GUI

(A) Filmstrip and waveform area; (B) Clause overlap navigation area; (C) Time-stamp
clause view; (D) Time-stamp clause table view; (E) Systems choice window; (F)
Clause editor


Figure  2(c)  Screenshots  of  Interviewees  with  Overlays  (Fox  News) 
(A)  Dr  Kevin  Trenberth  (B)  Myron  Ebell 
The annotation units (the nodes and overlays) containing the system choices for
linguistic,  visual  and  audio  resources  are  related  to  each  other  both  in  terms  of  time 
and  space.  The  ability  to  precisely  encode  the  spatial-temporal  relations  between  the 
different  choices  and  store  them  in  a  database  for  later  retrieval  and  analysis  is  a  key 
step  forward  for  advancing  our  knowledge  and  understanding  of  how  choices  integrate 
to  create  meaning  in  dynamic  media.  In  addition,  facilities  are  provided  for  defining 
and  annotating  network-like  relationships  between  the  annotation  units.  These 
relationships  are  implemented  as  nested  links  and  chains,  which  the  analyst  codes  by 
clicking  on  an  annotation  unit  and  linking  it  to  another  annotation  unit.  The  links 
themselves  are  annotated  using  system  choices  for  inter-semiotic  relations. 

Automated  algorithms  which  are  generic  enough  to  enhance  productivity  are  also 
implemented  in  the  multimodal  analysis  software:  for  example,  video  shot  detection 
for  identifying  significant  changes  in  the  video;  audio  silence/speech/music 
classification  for  identifying  intervals  of  likely  silence,  speech  or  music;  face  detection 
for  identifying  faces  in  videos  and  images;  tracking  for  automatically  tracking  objects 
in  videos;  and  optical  flow  for  detecting  the  motion  of  objects,  surfaces,  and  edges. 

Search,  retrieval  and  export  facilities  in  the  software  permit  the  SFT  multimodal  base 
data  to  be  imported  into  third-party  software  for  mathematical  analysis  and 
visualization. 

2.3  Mathematical  Analysis  and  Visualization 
2.3.1  Techniques  for  SFT  Linguistic  Analysis 

The  aim  of  the  mathematical  analysis  is  to  reveal  and  understand  how  meaning  is 
being  made  in  texts,  in  particular  the  dynamic  accumulation  of  meaning  as  the  text 
unfolds.  The  SFT  linguistic  annotations  provide  an  extensive  decomposition  of  the  text 
into  functional  elements,  typically  word  groups  in  clauses  which  function  together  as  a 
semantic  unit.  The  meaning  potential  of  these  functional  elements  is  multidimensional 
in the sense that each element plays a role in the different SFT systems. This results in
a complex data structure, where the text is decomposed into word groups, which are
further grouped into larger and larger groups which are analyzed multiple times
according to their metafunctional roles. The data structure includes annotations that are
attribute tags attached to each element, or group, where the attribute tags are options
drawn from the hierarchically organized SFT systems.

One of the projections of this data structure we have extensively explored is clause-tag
associations, which can be conveniently represented as a binary matrix. In this matrix
representation each row is associated with a clause and each column with a tag; an
entry is 1 if the tag is attached to the corresponding clause and 0 otherwise. In this data
projection the text is represented as a cloud of points in a dual vector space, the
clause-space and tag-space, corresponding to the rows and columns of the binary matrix.
The text can be investigated through examination of the dual space, for example, using
singular value decomposition (SVD) and clustering techniques. The features of the text
are visualized using various network diagrams and by projection of the features back
onto the text using color tints and font attributes. The various visual renderings are
transformations and filterings of the underlying data structure.
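
The clause-tag projection can be illustrated with a minimal sketch. The clause identifiers and tag names below are hypothetical, chosen only for illustration, and the function `leading_singular_triplet` is not part of the Systemics software. To keep the example free of third-party dependencies, it uses power iteration to recover only the largest singular value and its pair of singular vectors; a full SVD from a numerical library would normally be used instead.

```python
# Hypothetical toy annotations: each clause is listed with the SFT tags attached to it.
clause_tags = {
    "clause_1": {"material", "declarative"},
    "clause_2": {"material", "declarative", "modalized"},
    "clause_3": {"verbal", "interrogative"},
    "clause_4": {"verbal", "interrogative", "modalized"},
}

clauses = sorted(clause_tags)
tags = sorted(set().union(*clause_tags.values()))

# Binary matrix: one row per clause, one column per tag, 1 where the tag is attached.
M = [[1 if tag in clause_tags[cl] else 0 for tag in tags] for cl in clauses]


def leading_singular_triplet(matrix, iterations=200):
    """Power iteration on M^T M; returns (sigma, u, v) for the largest singular value.

    u lives in clause-space (one weight per row), v in tag-space (one per column).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0] * n_cols
    for _ in range(iterations):
        u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = sum(x * x for x in u) ** 0.5
    u = [x / sigma for x in u]
    return sigma, u, v


sigma, u, v = leading_singular_triplet(M)
for cl, weight in zip(clauses, u):
    print(cl, round(weight, 3))
```

The vector `u` weights the clauses and `v` weights the tags along the dominant direction of the data, which is one way of reading the dual clause-space and tag-space description above.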

2.3.2  Clustering  Techniques  for  SFT  Multimodal  Analysis 

The complexity of the SFT data structure is increased in multimodal analysis, where
time-stamped linguistic annotations and image and video analyses (e.g. camera angle,
gaze vectors, on-screen engagement etc.) introduce the additional dimension of time.
Dimensionality reduction was undertaken using the k-means clustering algorithm
(MacQueen 1967), where k is the number of clusters, with binary coding applied to
the annotations. The entire system was divided into k clusters, with a different k for
each of the metafunctions (textual, interpersonal and ideational) and for the video
analysis, using iterative techniques to find the best value of k. In addition, network
diagrams showed the transitions between clusters for different speakers.

One  disadvantage  of  this  approach  is  that  k-means  clustering  is  very  sensitive  to 
cluster  centers  (or  choice  combinations)  so  that  clauses  belonging  to  the  same  cluster 
may  not  have  exactly  the  same  set  of  annotations.  For  this  reason,  the  k  value  must  be 
carefully  selected,  and  in  our  case,  different  k  values  were  assigned  according  to  the 
number  of  available  choices  for  the  different  systems  for  each  metafunction. 
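
A minimal sketch of k-means on binary-coded annotations is given below. The coded vectors are hypothetical, and the initialization (the first k distinct points) is an assumption made here so that the example is deterministic; the report does not state how the centers were initialized or how the best k was selected.

```python
def kmeans(points, k, iterations=50):
    """Plain k-means (Lloyd's algorithm) with squared Euclidean distance.

    Assumption for this sketch: the initial centers are the first k distinct points.
    """
    centers = []
    for p in points:
        if list(p) not in centers:
            centers.append(list(p))
        if len(centers) == k:
            break
    if len(centers) < k:
        raise ValueError("fewer distinct points than clusters")

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical binary coding: one row per clause, one column per system choice.
coded = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 0],
]
labels, centers = kmeans(coded, k=2)
print(labels)
```

In this toy run the first and third clauses fall into the same cluster although their annotations differ in the last column, which is the effect described above: membership of a cluster does not guarantee identical sets of annotations.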
2.3.3 Allen’s Interval Algebra for SFT Multimodal Analysis


To further investigate how different annotations for linguistic, visual and video
systems work together to create meaning, patterns of combinations, trends and outliers
were analyzed using an algorithmic approach based on Allen’s (1983) interval algebra
[4]. This approach is explained in some detail because it was used to mathematically
model semantic choices as they interact over time.

For the SFT multimodal data, let annotation $A$ be a set of annotation units $u$,
$A = \{u_i\}$, $i = 1..N$, where $N$ is the number of annotation units in the annotation.
In the case of video analysis, every annotation unit $u$ is a triplet $u = (t_1, t_2, c)$,
$t_1 < t_2$, where $t_1$ is the start timestamp, $t_2$ is the end timestamp and $c$ is the
system choice associated with this annotation unit. In other words, the annotation unit
defines an interval on the time axis and the system choice attached to that interval.
Further, we assume that all annotation units $u$ belong to the same annotation $A$.

In order to describe recurring sets of annotation units, relate sets of units to each other
and identify whether a new set of units forms the same pattern as earlier occurring sets,
we use a fuzzy adaptation of Allen’s (1983) interval algebra proposed in Snoek &
Worring (2005). This framework defines eight logical relationships, referred to as
Allen’s relationships, together with a ninth ‘not defined’ outcome, stating that any two
given time intervals $u_1$ and $u_2$ may be:

1. Not related (N)

2. $u_1$ precedes $u_2$ (P)

3. $u_1$ meets $u_2$ (M)

4. $u_1$ overlaps $u_2$ (O)

5. $u_1$ starts with $u_2$ (S)

6. $u_1$ is during $u_2$ (D)

7. $u_1$ finishes with $u_2$ (F)

8. $u_1$ equals $u_2$ (E)

9. Not defined (-)


When the 9th relationship holds, the ordering of $u_1$ and $u_2$ must be reversed to
identify which relationship from 1 to 8 takes place. Let us denote Allen’s relationship
for time intervals $u_1$ and $u_2$ as $a(u_1, u_2)$. Any set of annotation units
$u_1, \ldots, u_K$ defines a square matrix $P(u_1, \ldots, u_K) = (a_{ij})$, where
$a_{ij} = a(u_i, u_j)$ and $i, j = 1..K$. In other words, any pairwise combination of
annotation units has a corresponding Allen’s relationship, and matrix $P$ describes how
annotation units are related to each other in Allen’s sense in the given set of units.
Obviously, the main diagonal elements of this matrix are equal to E, since any
annotation unit is equal to itself. In order to take system choices of annotation units
into account we define a vector of system choices
$c(u_1, \ldots, u_K) = (c_1, \ldots, c_K)$, with elements being system choices from
annotation units $u_1, \ldots, u_K$.

We define a pattern $\Pi$ as a pair $\Pi = (P, c)$, where $P$ is a matrix of Allen’s
relationships of size $K \times K$ and $c$ is a vector of choices of size $K$. The set of
annotation units $u_1, \ldots, u_K$ is said to belong to pattern $\Pi = (P^*, c^*)$ if
$P(u_1, \ldots, u_K) = P^*$ and $c(u_1, \ldots, u_K) = c^*$. This definition naturally
demands the annotation units to be in the same configuration in Allen’s sense and have
the same system choices. The proposed definition of the pattern enables us to move
from operating with timestamps to operating with Allen’s relationships, providing a
mathematical basis to process and compare sets of annotation units in a semantically
meaningful domain.
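
The definitions above can be sketched in code as follows. This is an illustration under stated assumptions, not the implementation used in the project: the parameters `tol` (the tolerance within which two timestamps are treated as equal) and `max_gap` (the separation beyond which two units are ‘not related’) are hypothetical stand-ins for the fuzzy margins of the adapted algebra, and the system choice labels are invented.

```python
def allen_relation(u1, u2, tol=0.0, max_gap=float("inf")):
    """Relation code a(u1, u2) for two annotation units (t1, t2, c)."""
    s1, e1, _ = u1
    s2, e2, _ = u2

    def same(x, y):
        return abs(x - y) <= tol

    # Assumption: units separated by more than max_gap are 'not related' in either order.
    if max(s2 - e1, s1 - e2) > max_gap:
        return "N"
    if same(s1, s2) and same(e1, e2):
        return "E"
    if same(s1, s2):
        return "S" if e1 < e2 else "-"
    if same(e1, e2):
        return "F" if s1 > s2 else "-"
    if same(e1, s2):
        return "M"
    if e1 < s2:
        return "P"
    if s1 < s2 and e1 < e2:
        return "O"
    if s1 > s2 and e1 < e2:
        return "D"
    return "-"  # defined only when the ordering of the two units is reversed


def pattern(units, **kwargs):
    """Return the pair (P, c): relation matrix and vector of system choices."""
    P = tuple(tuple(allen_relation(a, b, **kwargs) for b in units) for a in units)
    c = tuple(u[2] for u in units)
    return P, c


# Two sets of units with different timestamps but the same configuration.
first = [(0.0, 4.0, "gaze: direct"), (1.0, 3.0, "modality: high")]
second = [(10.0, 15.0, "gaze: direct"), (11.5, 12.0, "modality: high")]

print(pattern(first))
print(pattern(first) == pattern(second))
```

Because the comparison is made on the pair (P, c) rather than on raw timestamps, the two sets are recognized as instances of the same pattern even though they occur at different times and have different durations.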

The pattern histogram, which simply counts the patterns of size $K$, is the most
basic technique used for this method. The algorithm is outlined as follows:

For all possible combinations of $K$ annotation units:

1. Calculate pattern $\Pi$ for a combination $u_1, \ldots, u_K$.

2. Set the counter for $\Pi$ to 1 if it has never occurred before, or increment the
counter otherwise.

The pattern histogram is used to identify the most frequent patterns in the SFT
multimodal base data. The main disadvantages of this approach are the large number
of patterns discovered and the necessity to explicitly define parameter $K$, the size of
the patterns to look for. Even a simple analysis may generate thousands of unique
combinations, and this number grows exponentially as $K$ increases. Therefore, we
propose a basic filtering technique for pattern histogram calculation, which filters out
all patterns containing (N), i.e. the 1st Allen’s relationship. This approach is motivated
by the idea that if two annotation units are ‘not related’ in time, then it does not make
sense to consider these units in the pattern. This technique filters out the vast majority
of the patterns, but the total number is still too high to be interpreted by the human
analyst.
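
The pattern histogram and the (N) filter can be sketched as below. The relation function is a crisp (non-fuzzy) simplification, and the `max_gap` threshold that decides when two units are ‘not related’ is an assumption of this sketch, as are the example units and their labels. The histogram stores, for every pattern, the index tuples of the units that form it, so the counter of a pattern is the length of its list.

```python
from collections import defaultdict
from itertools import combinations


def allen_relation(u1, u2, max_gap):
    """Crisp relation code for two units (t1, t2, c); max_gap is a hypothetical threshold."""
    (s1, e1, _), (s2, e2, _) = u1, u2
    if max(s2 - e1, s1 - e2) > max_gap:
        return "N"
    if (s1, e1) == (s2, e2):
        return "E"
    if s1 == s2:
        return "S" if e1 < e2 else "-"
    if e1 == e2:
        return "F" if s1 > s2 else "-"
    if e1 == s2:
        return "M"
    if e1 < s2:
        return "P"
    if s1 < s2 and e1 < e2:
        return "O"
    if s1 > s2 and e1 < e2:
        return "D"
    return "-"


def pattern_histogram(units, K, max_gap=float("inf"), drop_unrelated=True):
    """Map each pattern (P, c) of size K to the index tuples of the units forming it."""
    occurrences = defaultdict(list)
    for combo in combinations(range(len(units)), K):
        chosen = [units[i] for i in combo]
        P = tuple(tuple(allen_relation(a, b, max_gap) for b in chosen) for a in chosen)
        if drop_unrelated and any("N" in row for row in P):
            continue  # the basic filter: skip patterns containing (N)
        c = tuple(u[2] for u in chosen)
        occurrences[(P, c)].append(combo)
    return dict(occurrences)


units = [
    (0.0, 4.0, "speaker: interviewer"),
    (1.0, 3.0, "gaze: direct"),
    (10.0, 14.0, "speaker: interviewer"),
    (11.0, 13.0, "gaze: direct"),
]
filtered = pattern_histogram(units, K=2, max_gap=2.0)
unfiltered = pattern_histogram(units, K=2, max_gap=2.0, drop_unrelated=False)
for (P, c), occ in filtered.items():
    print(c, P, "count =", len(occ))
print(len(unfiltered), "patterns without the filter")
```

Since every combination of $K$ units is visited, the loop makes the cost of a large $K$ explicit, which is the limitation discussed above.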

The large number of unique patterns is generated because the algorithm takes all
possible combinations of annotation units into account. Therefore, even a small
number of annotation units may generate a much higher number of patterns, since the
pattern histogram does not favor any pattern and accounts for them all. Further filtering
of the pattern histogram can be done based on the assumption that one may be more
interested in frequently repeated patterns than in less repeated ones. We may, therefore,
require that patterns with a lower counter may not share annotation units with higher
counter patterns, that is, annotation units that contribute to a pattern with a higher
counter may not contribute to a less frequent pattern. The algorithm may be outlined as
follows:

1. Calculate the pattern histogram.

2. Sort patterns in the histogram by their counter.

3. Start from the pattern $\Pi$ with the highest counter.

4. Check if there are patterns with a lower counter sharing annotation units with $\Pi$.

5. Reduce their counters accordingly.

6. Delete any pattern whose counter reaches 0.

7. Repeat for all patterns.
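
One possible reading of these steps is sketched below. The outline does not specify how ties are broken or whether the histogram is re-sorted after counters are reduced, so the single pass over the initial ordering is an assumption of this sketch. Patterns are represented by hypothetical string labels, each mapped to the index tuples of the annotation units that form its occurrences.

```python
def filter_histogram(occurrences):
    """Keep frequent patterns first; less frequent ones lose occurrences that reuse units.

    occurrences: dict mapping a pattern to a list of tuples of annotation unit indices.
    Returns a dict of the same shape with patterns of counter 0 removed.
    """
    # Steps 1-2: the counter of a pattern is the number of its occurrences.
    order = sorted(occurrences, key=lambda p: len(occurrences[p]), reverse=True)
    claimed = set()  # units already used by a pattern with a higher counter
    result = {}
    for p in order:  # steps 3 and 7: from the highest counter downwards
        # Steps 4-5: drop occurrences sharing units with higher-counter patterns.
        kept = [occ for occ in occurrences[p] if not claimed.intersection(occ)]
        if kept:  # step 6: a pattern whose counter reaches 0 is deleted
            result[p] = kept
            for occ in kept:
                claimed.update(occ)
    return result


histogram = {
    "A": [(0, 1), (2, 3), (4, 5)],
    "B": [(1, 6), (3, 7)],
    "C": [(6, 7)],
    "D": [(8, 9)],
}
reduced = filter_histogram(histogram)
for name, occ in reduced.items():
    print(name, len(occ))
```

In this toy input, pattern B disappears because both of its occurrences reuse units already assigned to the more frequent pattern A, while C and D survive unchanged.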


This approach favors highly repeated patterns over less frequent patterns and greatly
reduces the total number of patterns in the histogram, making it easier to interpret
manually. The approach still requires the pattern size $K$ to be explicitly defined,
however. This is problematic since it is difficult to estimate $K$ in advance and the
different patterns in the SFT multimodal data may have different sizes. To address
these problems, an alternative approach was developed.

Consider a pattern $\Pi$ of size $K > 2$ repeated $n$ times, and let us investigate the
results for patterns of size $K - 1$ in the same SFT multimodal base data. Naturally,
sub-patterns of $\Pi$ will be discovered and these sub-patterns will have a counter
greater than or equal to $n$. In any annotation, smaller size patterns contributing to a
bigger size pattern are present as frequently as, or more frequently than, the bigger
pattern itself. In this sense, we can identify larger size patterns by analyzing their
smaller size components. In fact, we can calculate the pattern histogram for $K = 2$ and
then analyze it to identify patterns of any size by looking at how patterns share
annotation units. Sub-patterns of a bigger pattern use the same annotation units or,
more precisely, their sets of annotation units intersect to a great extent. Checking all
patterns pairwise can identify this intersection by detecting the annotation units which
belong to both.

This approach converts the pattern histogram into a network, where the nodes of the
network are patterns, and an edge between two patterns is recorded when sharing of
annotation units is detected. Patterns of bigger size form densely interconnected
clusters of nodes in the network, which can be later detected by parsing the network
structure. This approach, combined with a mutual exclusiveness requirement for the
histogram, makes identification of bigger size patterns practically possible since it
avoids exponential complexity explosion by looking at patterns of size $K = 2$ only.
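
A minimal sketch of the network construction is given below. The pattern labels and unit indices are hypothetical, and connected components are used here only as the simplest stand-in for detecting densely interconnected clusters; the report does not state which method was used to parse the network structure.

```python
from itertools import combinations


def pattern_network(occurrences):
    """Nodes are size-2 patterns; an edge joins two patterns that share annotation units."""
    units_of = {p: set().union(*map(set, occ)) for p, occ in occurrences.items()}
    edges = {p: set() for p in occurrences}
    for p, q in combinations(occurrences, 2):
        if units_of[p] & units_of[q]:
            edges[p].add(q)
            edges[q].add(p)
    return edges


def connected_components(edges):
    """Group the nodes of the network into connected components (depth-first search)."""
    seen, components = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        components.append(component)
    return components


# Units 0, 1, 2 (and again 10, 11, 12) form a recurring size-3 configuration,
# visible only through its three size-2 sub-patterns; "XY" is unrelated.
pairs = {
    "AB": [(0, 1), (10, 11)],
    "BC": [(1, 2), (11, 12)],
    "AC": [(0, 2), (10, 12)],
    "XY": [(20, 21)],
}
network = pattern_network(pairs)
clusters = connected_components(network)
print(sorted(sorted(cluster) for cluster in clusters))
```

The three mutually linked size-2 patterns form one cluster, suggesting an underlying pattern of size 3 without ever enumerating combinations of three units.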

These  mathematical  techniques  were  applied  to  SFT  linguistic  base  data  and  SFT 
multimodal  base  data  derived  from  the  following  case  studies. 

2.4  The  Case  Studies 

2.4.1  Global  Financial  Crisis  and  Climate  Change 

Six case studies were selected for analysis in the first phase of the project. Case Study 1
is concerned with a financial advisor's view of the global financial crisis which unfolded
in 2008, while Case Studies 2 to 6 are selected from a corpus of texts on climate
change, in particular those focusing on events surrounding the United Nations
Copenhagen Climate Change Summit 2009 (COP 15) in Copenhagen, Denmark on
7-18 December 2009. The financial crisis and climate change were chosen on the basis
of their global significance and the evolution of media reporting about these two
events. The reporting of the financial crisis has a shorter time span compared to
climate change, which has been the subject of discussion for decades. Both issues are
currently being reported, however, within an environment where there is a basic
distrust of the different interest groups and mainstream institutions (e.g. banks).

The  six  case  studies  are: 

1.  Title:  ‘Commentary:  Why  there  is  a  crisis  -  and  how  to  stop  it’ 

Author:  David  Smick 

Source:  CNN  News 

http://edition.cnn.com/2008/POLITICS/10/09/smick.crisis/index.html
Date:  10  October  2008 
Type:  Website 

Description:  A  financial  advisor  presents  his  views  regarding  the  origins  of  the 
global  financial  crisis  and  what  needs  to  be  done  in  order  to  restore  the  situation. 

2.  Title:  ‘Are  climate  scientists  over-selling  their  models?’ 

Author:  Fred  Pearce 

Source:  New  Scientist 

http://www.newscientist.com/article/mg20026851.900-are-climate-scientists-overselling-their-models.html?full=true

Date:  4  December  2009 

Type:  Website 

Description:  Professor  Lenny  Smith,  a  climate  scientist  at  the  London 
School  of  Economics,  is  interviewed  regarding  the  usefulness  of  climate 
models  for  forecasting  climate  and  weather  patterns. 

3.  Title:  ‘Hackers  target  leading  climate  research  unit’ 

Author:  BBC  News  Online 

Source:  BBC  News 

http://news.bbc.co.uk/2/hi/8370282.stm
Date:  20  November  2009 
Type:  Website 

Description:  The  text  is  a  news  report  on  the  email  hacking  incident  that  occurred 
at  the  Climatic  Research  Unit  at  the  University  of  East  Anglia  in  November  2009, 
just  before  the  United  Nations  conference  on  climate  change  (COP  15)  in 
Copenhagen.


4.  Title:  ‘Hackers  steal  electronic  data  from  top  climate  research  center’ 

Author:  Juliet  Eilperin 

Source:  The  Washington  Post 

http://www.washingtonpost.com/wp-dyn/content/article/2009/11/20/AR2009112004093.html

Date:  21  November  2009 
Type:  Website 

Description:  The  text  also  reports  on  the  email  hacking  incident  that  occurred  at 
the  Climatic  Research  Unit  at  the  University  of  East  Anglia. 

5.  Title:  ‘Q&A:  Professor  Phil  Jones’ 

Author:  Roger  Harrabin 
Source:  BBC  News 

http://news.bbc.co.uk/2/hi/8511670.stm
Date:  13  February  2010 
Type:  Website 

Description:  Roger  Harrabin,  one  of  the  world’s  most  senior  environment  and 
science  journalists,  interviews  Professor  Phil  Jones,  who  was  head  of  the  Climatic 
Research  Unit  at  the  University  of  East  Anglia  in  Britain  when  the  email  hacking 
incident  occurred. 

6.  Title:  ‘Phil  Jones  momentous  Q&A  with  BBC  reopens  the  “science  is 
settled”  issues’ 

Author:  Indur  M.  Goklany 
Source:  Watts  Up  with  That 

http://wattsupwiththat.com/2010/02/14/phil-jones-momentous-qa-with-bbc-reopens-the-science-is-settled-issues/

Date:  14  February  2010 
Type:  Website 

Description:  The  text  is  a  blog  entry  from  the  well-known  climate  change  blog 
Watts  Up  with  That,  managed  by  Anthony  Watts,  an  American  broadcast 
meteorologist.  The  text  was  written  by  a  guest  writer,  Indur  M.  Goklany. 
The six texts were selected on the basis of their relations with each other. That is, Case
Studies  1  and  2  were  chosen  to  see  how  experts  from  two  different  domains,  one  of 
science  and  another  of  economics  and  finance,  make  use  of  linguistic  choices  to 
achieve  their  communicative  intent.  Case  Studies  3  and  4  were  selected  to  investigate 
how  texts  from  different  news  agencies  reported  on  the  same  event,  and  Case  Studies  5 
and  6  provided  insights  into  the  framing  of  expert  opinions  about  an  event.  The 
analysis  of  the  six  case  studies  reveals  significant  differences  in  how  the  resources  of 
language  are  employed  to  communicate  information  and  influence  readers. 

2.4.2  Climate  Change  and  Email  Hacking  Incident:  Written  Reports 

The  BBC  News  and  Washington  Post  texts  (Case  Studies  3  and  4)  about  the  email 
hacking  incident  at  the  Climatic  Research  Unit  at  the  University  of  East  Anglia  in 
November  2009  were  revisited  in  phase  two  of  the  project  to  further  examine  the 
usefulness  of  the  visualization  facilities  in  Systemics  and  to  assist  in  the  development 
of  mathematical  tools  for  tracking  the  dynamics  of  the  text.  The  two  texts  are  initial 
reports  of  the  ‘Climategate’  incident,  which  happened  about  two  weeks  before  the 
United  Nations  Conference  on  Climate  Change  in  Copenhagen  in  December  2009.  The 
BBC  News  and  the  Washington  Post  texts  were  chosen  to  investigate  ideological 
differences  in  the  two  reports. 

2.4.3  Climate  Change  and  Email  Hacking  Incident:  Televised  Interviews 

The  focus  turned  towards  televised  interviews  about  the  email  hacking  incident  at  the 
Climatic  Research  Unit  at  the  University  of  East  Anglia  in  the  third  phase  of  the 
project.  Two  videos  from  Fox  News  (http://www.foxnews.com/)  and  CNN  News 
(http://edition.cnn.com)  were  analyzed  in  terms  of  linguistic,  image  and  video  systems 
for  textual,  interpersonal  and  experiential  meanings  for  the  purpose  of  examining  the 
interactional  and  experiential  content  of  the  video  and  the  degree  of  persuasiveness 
with  which  each  interviewee  puts  forth  his  case.  The  details  of  these  two  videos  are 
given  below. 

1. Program: Happening Now, a Fox News Corporation breaking-news programme
Date: 25 November 2009

http://video.foxnews.com/v/3945521/illegal-act

Interviewer:  Jon  Scott 

Interviewees: 

•  Dr.  Kevin  E.  Trenberth:  Distinguished  Senior  Scientist  in  the  Climate  Analysis 
Section  at  the  National  Center  for  Atmospheric  Research  in  Colorado 

• Mr. Myron Ebell: Director of energy and global warming policy at the
Competitive Enterprise Institute, Washington DC

2.  Program:  Campbell  Brown,  a  former  CNN  news  program 
Interviewer:  Campbell  Brown 
Date:  7  December  2009 

http://www.youtube.com/watch?v=Tsh7QUy4CvE 

http://www.youtube.com/watch?v=ucz_iCJCoZE&feature=watch_response_rev 

Interviewees: 

• Chris Horner: Senior Fellow, Center for Energy and Environment at the Competitive
Enterprise Institute in Washington DC

•  Stephen  McIntyre:  Mathematician  and  founder  and  editor  of  Climate  Audit,  a  blog 
devoted  to  the  analysis  and  discussion  of  climate  data 

•  Michael  Oppenheimer:  Albert  G.  Milbank  Professor  of  Geosciences  and 
International  Affairs  in  the  Woodrow  Wilson  School  and  the  Department  of 
Geosciences  at  Princeton  University  and  Director  of  the  Program  in  Science, 
Technology  and  Environmental  Policy  (STEP)  at  the  Woodrow  Wilson  School 
and  Faculty  Associate  of  the  Atmospheric  and  Ocean  Sciences  Program,  Princeton 
Environmental  Institute,  and  the  Princeton  Institute  for  International  and  Regional 
Studies. 

3  Results  and  Discussion 

3.1  Semantic  Patterns  and  Comparative  Analysis 

One of the key innovations of the project is that many of the qualitative aspects of
meaning making in a text previously described by Halliday (1978), Halliday and
Matthiessen (2004), Martin (1992) and others can be associated with quantifiable
aspects of our data structures [3], [4], [5], [6], [7]. For example, qualitative features
can  be  identified  with  reference  points  in  the  clause-tag  dual-space.  The  degree  to 
which  a  text  possesses  a  feature  can  be  described  in  terms  of  barycentric  coordinates 
with  respect  to  predefined  reference  points  and  metrics.  The  findings  for  the  six  case 
studies  are  described  below. 
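
The use of barycentric coordinates can be illustrated with a minimal sketch. It assumes a two-dimensional projection of the clause-tag dual-space and three reference points forming a triangle; the function name, the reference points and the sample position are hypothetical and are not taken from the project data.

```python
def barycentric_coordinates(point, triangle):
    """Return weights (w1, w2, w3), summing to 1, such that
    point = w1*A + w2*B + w3*C for the reference triangle (A, B, C)."""
    x, y = point
    (x1, y1), (x2, y2), (x3, y3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("reference points are collinear")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return (w1, w2, 1.0 - w1 - w2)


# Hypothetical reference points, each standing for one qualitative feature.
reference_points = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
# Hypothetical position of a text in the projected space.
text_position = (0.25, 0.25)
weights = barycentric_coordinates(text_position, reference_points)
print(weights)  # (0.5, 0.25, 0.25)
```

Each weight can be read as the degree to which the text possesses the feature associated with the corresponding reference point; a weight of 1 places the text exactly on that reference point.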

In  the  CNN  News  text  (Case  Study  1),  the  financial  advisor  presents  his  views 
regarding  the  origins  of  the  global  financial  crisis  and  what  needs  to  be  done  in  order 
to  restore  the  situation.  The  analysis  reveals  that  although  on  the  surface  the  text 
appears  to  present  an  objective  view  of  the  financial  crisis,  there  are  multiple 
underlying strategies where the author uses a range of linguistic systems (particularly
modality and transitivity systems) in a metaphorical fashion to present himself as an
authority with knowledge of both the causes and solutions to the global financial crisis.
This  may  be  compared  to  the  New  Scientist  text  (Case  Study  2),  where  the  scientist 
constantly  qualifies  his  statements  about  the  usefulness  of  climate  models  through  the 
use  of  congruent  modality  resources  (e.g.  Finite  elements,  Mood  Adjuncts),  unlike  the 
global  financial  advisor  in  Case  Study  1  who  uses  metaphorical  resources  to  achieve  a 
high  level  of  apparent  certainty. 

The  visualization  techniques  revealed  semantic  patterns  in  these  texts  which  otherwise 
would  have  been  difficult  to  detect.  For  example,  in  the  CNN  News  text  (Case  Study  1), 
recurrence  plots  revealed  phases  in  transitivity  patterns  in  descriptions  of  the  global 
financial  crisis  corresponding  to  the  author’s  recount  of  events  (in  terms  of  material 
actions)  and  his  solution  to  the  problem  (in  terms  of  relations  between  different 
entities)  as  displayed  in  Figure  3.  Significantly,  the  semantic  patterns  involve 
interactions  across  different  grammatical  systems.  For  example,  recurrence  plots  were 
used  to  identify  relations  across  system  choices  for  textual  organization  and  logical 
meaning  in  Figure  4  and  transitivity  and  lexical  strings  which  function  to  amplify  the 
magnitude of the financial crisis in Figure 5. The analyses reveal the inter-dependency of
semantic systems and illustrate the need to adopt a multi-layered and multi-faceted
analysis of text [6], [7].
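
How a recurrence plot exposes phases of this kind can be sketched as follows. The sketch assumes that each clause is reduced to a single categorical system choice and that two clauses "recur" when their choices are equal; recurrence plots for numerical data usually use a distance threshold instead. The clause sequence is hypothetical.

```python
def recurrence_matrix(choices):
    """R[i][j] is 1 when clauses i and j share the same system choice."""
    n = len(choices)
    return [[1 if choices[i] == choices[j] else 0 for j in range(n)]
            for i in range(n)]


# Hypothetical transitivity choices for eight consecutive clauses:
# a run of material processes followed by a run of relational processes.
processes = ["material"] * 4 + ["relational"] * 4
R = recurrence_matrix(processes)

# Text rendering of the plot: '#' marks a recurrence, '.' marks none.
for row in R:
    print("".join("#" if value else "." for value in row))
```

In the printed matrix the two runs appear as two square blocks along the diagonal, which is the visual signature of two successive phases.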

Figure 3 Transitivity Phases (legend: Clauses 7 to 55 and Clauses 56 to 80; panel labels: Phase I and Phase II)


Figure 4 Thematic Patterns and Logical Meaning


Figure 5 Transitivity and Lexis: Amplification of Financial Crisis


The BBC News text (Case Study 3) was one of the first news reports about the email
hacking incident at the Climatic Research Unit at the University of East Anglia to
emerge. The analysis reveals that the text producers reconstruct the event in terms of a
theft  of  information  (i.e.  a  burglary),  which  functions  to  subordinate  the  controversy 
regarding  claims  of  data  manipulation.  The  focus  of  the  article  is  directed  towards 
security  measures  at  the  university,  rather  than  the  researchers  working  in  the  Climatic 
Research  Unit.  The  SVD  illustrated  key  semantic  features  of  texts  [1],  [7];  for  example, 
the  use  of  modality  (i.e.  truth  value)  and  co-occurrence  of  semantic  tags  (Finite  Modal 
and ‘TH’ Subject) as displayed in Figure 6.

Figure 6 Tag Wheel and Text Visualization


On  the  other  hand,  the  Washington  Post  (Case  Study  4)  focuses  on  the  controversy 
arising  from  the  email  hacking  event  and  positions  climate  change  proponents  and 
climate  change  skeptics  as  opposing  parties,  with  the  proponents  being  presented  as 
defensive and the skeptics as objective and confident in their claims. Later, in the BBC
News text (Case Study 5), Professor Jones is questioned on several points arising as a result
of this controversy, including the use of the word ‘trick’ and the accusations that the
science  behind  global  warming  is  not  as  strong  as  climate  scientists  have  argued  it  to 
be.  The  analysis  reveals  how  the  scientist  tends  to  use  relational  processes  to  describe 
particular  states,  without  drawing  upon  interpersonal  resources  to  make  explicit 
evaluations  of  those  states,  unlike  the  Watts  Up  with  That  text  (Case  Study  6)  where 
modality  is  frequently  used.  The  findings  suggest  that  climate  change  proponents  and 
climate  change  denialists  may  rely  on  different  meaning-making  strategies,  particularly 
in  relation  to  the  expression  of  uncertainty  and  doubt.  Interactive  visualization  tools 
permitted  comparison  of  such  semantic  patterns  across  the  six  case  studies,  as 
displayed  in  Figure  7. 


Figure  7  Interactive  Visualization  Comparing  the  Features  of  Six  Texts 


While  qualitative  aspects  of  meaning  making  in  a  text  are  associated  with  quantifiable 
aspects  of  data  structures,  making  possible  the  visualization  of  semantic  patterns, 
modeling  the  dynamics  of  the  unfolding  meaning  in  a  text  proved  more  challenging. 
Different  approaches  were  explored,  for  example,  the  accumulation  of  semantic 
features  over  the  logical  structure  of  the  text  in  Figure  8,  state  machines  derived  from 
projection  and  clustering  of  the  underlying  data  structure  in  Figure  9  and  animations  of 
unfolding features within a text in Figure 10.

Figure 8 Accumulation of Features over Logical Structure


Figure  9  State  Machine  Based  on  Clustering  in  Dual-Space 


Figure  10  Unfolding  Features  in  a  Text 

3.2  Generic  Profiles  for  Written  Texts 

One  of  the  key  advantages  of  our  quantitative  description  of  meaning  making  in  texts 
is  that  it  enables  comparative  analysis  of  texts,  and  the  identification  of  features  of  a 
text  that  deviate  from  genre  norms,  thus  making  it  possible  to  interpret  covert 
messages  (experiential,  logical,  interpersonal  and  textual)  which  are  not  immediately 
apparent. For example, a comparison of mathematical visualizations (i.e. neighborhood
plots, recurrence plots, tag wheels) for the BBC News (Case Study 3) and Washington
Post (Case Study 4) texts about the email hacking incident at the University of East
Anglia reveals different linguistic properties and construals of the event, as explained
below.

The  BBC  News  and  the  Washington  Post  texts  are  essentially  news  recounts  of  the 
same  event.  Both  texts  contain  a  third-person  recount  of  the  event,  following  which 
certain  individuals  are  called  upon  to  give  their  thoughts  and  opinions  with  regard  to 
the  event.  However,  even  with  a  similar  communicative  purpose,  the  two  texts  differ  in 
the  construal  of  the  event.  The  BBC  News  text  has  a  narrower  focus  and  contains  a 
smaller  variety  of  lexical  verbs,  focusing  mainly  on  the  statements  made  by  the 
affected university, the police and the IT expert. In contrast, the Washington Post text
expands the semantic field to include verbs which ‘do’ more than just make
statements, such that the resultant effect is a degree of ‘action’ which exceeds the
generic expectation of a recount as the re-telling of facts. Simply put, there is a lot less
‘action’ in the BBC News text compared to the ‘drama’ construed in the Washington
Post  text.  Aided  by  particular  tense  and  modality  values  assigned  to  participants  in  the 
text  (in  particular,  those  who  were  invited  to  give  their  views  on  the  incident),  the 
‘who’  and  their  respective  actions  take  centre  stage  in  the  Washington  Post,  while  the 
BBC  News  text  focuses  on  the  event  and  ‘what’  happened. 

The quantification of semantic features for generating generic text profiles and speaker
profiles for the email hacking event was explored in relation to SFT multimodal base data
for video texts in the final phase of the project.


3.3  Generic  Profiles  for  Video  Texts 


3.3.1  Cluster  Distribution  over  Time 


Figure 11 Time-Stamped SFT Multimodal Data Base

The visual complexity of the time-stamped SFT multimodal data for the Fox News
interview  about  the  email  hacking  incident  at  the  Climatic  Research  Unit  at  the 
University  of  East  Anglia  (155  clauses  with  198  different  types  of  linguistic  and 
image/video  annotations)  is  displayed  in  Figure  11.  The  colors  red,  pink  and  black 
correspond  to  linguistic,  visual  and  video  choices  for  Jon  Scott,  the  interviewer 
(Speaker  1),  Dr  Kevin  Trenberth,  the  climate  scientist  (Speaker  2)  and  Mr  Myron  Ebell, 
the  climate  denialist  (Speaker  3)  respectively. 

Mathematical  techniques  were  applied  to  the  multimodal  SFT  base  data  to  interpret  the 
news  debate  genre  where  a  seemingly  unstructured  conversational  context  is  actually 
governed  by  codes  of  behavior  regarding  the  conduct  of  the  communicative  event  and 
the  nature  of  the  participant  roles.  From  clustering  and  network  visualizations,  we 
investigated  some  of  these  norms;  for  example,  how  limited  speaking  time  for 
participants  leads  to  competition  for  control  of  the  dialogic  space,  especially  in  a 
debate  where  participants,  in  this  case  Dr  Trenberth  and  Mr  Ebell,  address  certain 
issues  from  opposing  points  of  view. 

The k values for the k-means clustering of the system annotations were assigned
according to metafunction (textual, interpersonal, ideational) and resource type
(language and image/video). The resulting k values are: Textual: 8 (155*20 matrix with
633 non-zero elements); Interpersonal: 12 (155*81 matrix with 1333 non-zero
elements); Ideational: 12 (155*79 matrix with 954 non-zero elements); and
image/video: 8. Figure 12 shows the cluster distribution over time for Jon Scott, Dr
Trenberth and Mr Ebell (red, pink and black respectively), where clusters 1 to 8
belong to the Textual system, clusters 9 to 20 to the Interpersonal system and clusters 21
to 32 to the Ideational system for language, and clusters -8 to -1 belong to video. The
dimensionality of the data matrix has thus been reduced from 155*198 to 155*40
(8 + 12 + 12 + 8 = 40 clusters).
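
The bookkeeping behind this reduction can be sketched as follows. The sketch assumes that k-means has already been run separately on each block and has returned a zero-based label per clause, and it reads the 155*40 matrix as a cluster-membership (one-hot) encoding; that reading, the function names and the sample labels are assumptions made for illustration. Only the k values and the cluster numbering come from the text.

```python
# (k, first global cluster id) per block, following the ranges in the text.
BLOCKS = {
    "textual": (8, 1),          # clusters 1..8
    "interpersonal": (12, 9),   # clusters 9..20
    "ideational": (12, 21),     # clusters 21..32
    "image/video": (8, -8),     # clusters -8..-1
}


def global_cluster_id(block, local_label):
    """Map a zero-based k-means label within a block to its global id."""
    k, first = BLOCKS[block]
    if not 0 <= local_label < k:
        raise ValueError("label out of range for block")
    return first + local_label


def indicator_row(labels):
    """One-hot encode one clause; `labels` maps block name -> local label."""
    row = []
    for block, (k, _) in BLOCKS.items():
        row.extend(1 if labels[block] == j else 0 for j in range(k))
    return row


# Hypothetical labels for a single clause.
clause = {"textual": 3, "interpersonal": 0, "ideational": 11, "image/video": 7}
row = indicator_row(clause)
print(len(row))  # 40 columns per clause instead of 198
print([global_cluster_id(b, l) for b, l in clause.items()])  # [4, 9, 32, -1]
```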

Figure 12 The Distribution of the Clusters over Time


The  most  repeated  cluster  combinations-of-three  were  analysed  to  find  semantic 
patterns  in  the  discourse  and  the  most  significant  combinations  for  the  three  speakers. 
Figure  13  shows  the  occurrence  of  most  repeated  cluster  combinations-of-three,  where 
three  distinct  episodes  were  identified.  The  first  episode  contains  a  variety  of  clusters, 
while  the  second  episode  has  more  variation  within  a  tighter  time  frame.  The  third 
episode,  while  somewhat  similar  in  terms  of  length  of  time  frame  as  the  second 
episode,  contains  a  distinctly  different  set  of  cluster  combinations-of-three.  These  three 
episodes  are  examined  in  more  detail  below. 
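
Counting repeated combinations-of-three can be sketched in a few lines. The sketch assumes that each clause is described by one cluster id from each of the three language metafunctions (textual, interpersonal, ideational); the clause sequence and the function names are hypothetical.

```python
from collections import Counter


def most_repeated_triples(clauses, top=3):
    """clauses: list of (textual, interpersonal, ideational) cluster ids.
    Returns the `top` most frequent triples with their counts."""
    return Counter(clauses).most_common(top)


def occurrences(clauses, triple):
    """Clause positions at which a given triple occurs."""
    return [i for i, c in enumerate(clauses) if c == triple]


# Hypothetical sequence of six clauses.
clauses = [(4, 9, 21), (4, 9, 21), (6, 12, 30),
           (4, 9, 21), (6, 12, 30), (3, 15, 25)]
print(most_repeated_triples(clauses, top=2))
print(occurrences(clauses, (6, 12, 30)))
```

Plotting the positions returned by `occurrences` against time for the most frequent triples gives the kind of display in which episodes become visible as bands with different sets of combinations.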


Figure  13  Most  frequent  cluster  combinations-of-three 


Visually,  the  flow  of  contributions  from  Jon  Scott,  Dr  Trenberth  and  Mr  Ebell 
(coloured  red,  pink  and  black  respectively)  in  the  first  episode  develops  into  a 
somewhat  frenzied  ‘exchange’  with  more  frequent  short  bursts  amidst  the  increased 
variation  in  cluster  combination-of-three  in  the  second  episode,  following  which  there 
is a recapitulation of sorts in the third episode, although it is not a repetition of the first
episode. Mr Ebell is much less prominent in the last episode, in contrast to the middle
episode, where he seems to dominate the discourse to a greater degree than in the
first and last episodes.

The repeated combinations reveal discourse patterns which are indicative of an
interesting phenomenon. Using Jon Scott as an example, we see a marked difference in
the use of cluster combinations-of-three between the first episode and the third. If we refer
back  to  the  Fox  News  interview,  we  see  that  Jon  Scott,  as  interviewer,  is  trying  to  bring 
the  news  debate  interview  to  a  close  in  the  third  episode  by  acknowledging  his  guests. 
However,  Dr  Trenberth  interjects  with  new  information  and  Jon  Scott  only  gets  as  far 
as uttering his guests’ names before he is interrupted. Thus, there is a repetition of a
cluster combination-of-three consisting of ‘nil choices’, as the clauses labeled with this
combination-of-three are ‘Minor Clauses’ which do not have tag annotations in any
of the three metafunctions because they do not carry textual, interpersonal or
experiential meaning.

However,  in  the  first  episode,  Jon  Scott  is  engaged  in  a  question-answer  type 
interaction  where  he  and  Dr  Trenberth  are  not  competing  for  speaking  time,  but  rather 
questions  are  asked  and  responses  are  made.  In  fact,  both  speakers  share  similarities  in 
cluster  combination-of-three  use  in  the  first  episode,  and  this  could  perhaps  be 
indicative  of  a  less  tense  part  of  the  news  debate  interview,  compared  to  the  second 
episode  where  Dr  Trenberth  attempts  to  address  Mr  Ebell’s  arguments  against  him  and 
his  science  colleagues. 

Thus  from  the  cluster  visualizations,  we  can  observe  patterns  in  the  dynamics  of  the 
text  that  can  be  verified  and  investigated  further  upon  reference  back  to  the  actual  text 
itself.  The  interview  is  clearly  divided  into  three  distinct  parts,  with  a  middle  part 
that  is  quite  different  from  the  rest.  An  examination  of  cluster  occurrence  within  each 
episode  has  also  shown  characteristics  that  are  unique,  and  provide  tangible 
preliminary  evidence  for  the  sequential  development  of  any  text  in  stages  which  are 
particular to that register and genre.

3.3.2 Network visualizations


Network visualizations also reveal significant patterns in the text, for example, when a
speaker is continuously prevented from finishing his utterances, or when a speaker uses
particular clusters repeatedly or in sequence. Figure 14(a)-(c), for example, displays the
concentration of clusters and cluster relationships that are characteristic of Jon Scott,
Dr Trenberth and Mr Ebell with regard to their manner of organizing their discourse,
while also showing which clusters are outliers in this particular network pattern. These
patterns reveal the differences between the three speakers, as described below.
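
The data behind such a network can be sketched as a set of weighted directed edges. The sketch assumes that each speaker's clauses are reduced to a sequence of cluster ids for one metafunction and that an edge links each cluster to the one that follows it; the speakers and sequences are hypothetical and do not reproduce the interview data.

```python
from collections import Counter


def transition_edges(sequence):
    """Count directed edges between consecutive cluster ids. A self-loop
    (c, c) records immediate repetition of the same cluster."""
    return Counter(zip(sequence, sequence[1:]))


# Hypothetical textual-cluster sequences for two speakers.
by_speaker = {
    "Speaker A": [4, 4, 4, 6, 4],
    "Speaker B": [3, 6, 4, 6, 4, 5],
}
networks = {name: transition_edges(seq) for name, seq in by_speaker.items()}

for name, edges in networks.items():
    print(name, dict(edges))
```

Heavy self-loops point to a speaker who repeats one form of organization, whereas many distinct edges of low weight point to a wider range of cluster pairings.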


Interviewer Jon Scott favours simple forms of textual organization (Cluster 4 in Figure
14(a)), which enable him to focus quickly on issues of concern, while Dr Trenberth
frequently uses conjunctions like ‘and’ and ‘but’ to elaborate on and explain the points
he is trying to make (Cluster 6 in Figure 14(b)). Mr Ebell uses a wide range of textual
resources, including conversational continuatives, which result in many cluster pairings
(e.g. Cluster 3-6, Cluster 4-6 and Cluster 4-5 in Figure 14(c)), and repetition of
simple forms of thematic organization (Cluster 4 in Figure 14(c)). Mr Ebell’s use of
simplified forms of repetition has the effect of reinforcing his arguments, which are
delivered in a conversational style, compared to Dr Trenberth’s uneven attempts to
logically connect the events which are under discussion during the interview.

Such patterns, besides giving an indication of speaker profile, can also contribute
towards the profiling of text genres because such semantic patterns are indicative of
patterns at a register level (e.g. use of interrogatives and interpersonal vocative themes
in news debate interview contexts).

Figure 14(a) Textual Metafunction: Jon Scott


Figure 14(b) Textual Metafunction: Dr Kevin Trenberth


Figure 14(c) Textual Metafunction: Mr Myron Ebell

The  combination  of  time-based  visualization  and  network  visualization  moves  beyond 
more  obvious  measures  like  total  duration  of  speaking  time  to  look  at  what  actually 
happens  during  the  exchange.  For  example,  Mr  Ebell  speaks  for  the  shortest  time 
compared  to  the  other  two  speakers,  but  emerges  as  the  more  dominant  because  of  his 
choices  during  an  extended  period  of  speaking  time,  as  further  discussed  below. 


3.4  Generic  Speaker  Profiles 

The k-means clustering and network visualizations show different patterns for the
three speakers in terms of type, directionality and frequency. These differences can be
seen as unique to each speaker and, upon reference back to the text, show differences
in semantic meaning and stylistic preferences. The analysis undertaken using Allen’s
temporal logic contributes further information about the speaker profiles, in this case
for Mr Ebell, the lobbyist at the Competitive Enterprise Institute and a well-known
climate denialist, and Dr Trenberth, an internationally recognized climate scientist who
was a recipient of some of the hacked emails though not directly incriminated by them. As
we shall see, the different agendas of the two speakers are played out in the televised
interview [5].

Figure 15(a) (Partial) Strip data for Mr Myron Ebell

Figure 15(b) (Partial) Strip data for Dr Kevin Trenberth

Figure 16(a) Regions of occurrence of identified pattern for Mr Myron Ebell


Figure 16(b) Regions of occurrence of identified pattern for Dr Kevin Trenberth

Figures 15(a)-(b)¹ display segments from two strips of SFT multimodal base data,
with an inset showing an enlarged portion of the strip. The first strip is derived from
SFT multimodal data for Mr Ebell, and the second from SFT multimodal data for Dr
Trenberth. The strips are made up of ‘units’² (highlighted in black boxes in Figures
15(a)-(b)), where each unit consists of an annotation with start and end times. In each strip,
the bold units belong to networks of patterns or ‘clusters’ selected on the basis of
frequency and strength.
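
Because each unit carries a start and an end time, the temporal relation between any two units can be classified with Allen's thirteen interval relations. The following is a minimal sketch of that classification only; how the project combined these relations into patterns is not described here, and the two sample units with their times are hypothetical.

```python
def allen_relation(a, b):
    """Return Allen's relation of interval a to interval b.
    Intervals are (start, end) pairs with start < end."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"


# Hypothetical units: a linguistic 'Process' annotation and a video
# 'Interactive Meaning' annotation, with times in seconds.
process_unit = (12.4, 13.1)
interactive_meaning_unit = (10.0, 18.5)
print(allen_relation(process_unit, interactive_meaning_unit))  # during
```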

The  choices  in  the  units  for  the  patterns  are  ‘Interactive  Meaning’  in  the  video 
analysis  and  ‘Process’  in  the  linguistic  analysis.  These  are  key  system  choices  in  the 
inter-semiotic  discoursal  structure  for  visual  images  and  language.  The  larger  pattern 
count  of  73  for  Mr  Ebell  versus  52  for  Dr  Trenberth  over  two  regions  in  the  discourse 
(particularly  in  the  long  middle  segment  of  the  video),  as  compared  to  the  smaller 
discourse  segments  for  Dr  Trenberth  displayed  in  Figures  16(a)-(b),  indicates  a  certain 
consistency and concentration of this particular pattern type in Mr Ebell’s
contributions to the interview. Mr Ebell communicates most of his information during
the middle segment of the video with little rebuttal from Dr Trenberth, and the
consistency and concentration of a particular pattern contribute to the degree to
which Mr Ebell is seen to dominate the interview, particularly in comparison to Dr
Trenberth,  whose  speaking  turns  are  shorter  and  more  varied.  The  SFT  multimodal 
frameworks  for  linguistic  and  image/video  analysis  are  used  to  investigate  the  impact 
of  Mr  Ebell’s  pattern  of  selections  and  his  perceived  dominance  during  the  interview. 

The common units are ‘Interactive Meaning’ and ‘Process’ for both speakers, with
sub-categories which are ‘Interactive Meaning: Involvement’ and ‘Interactive Meaning:
Equality’ for Mr Ebell, and ‘Interactive Meaning: Representation Power’ and
‘Interactive Meaning: Detachment’ for Dr Trenberth. Thus, the significant pattern for
Mr Ebell in terms of interactive meaning is his direct engagement with his audience,
whereas Dr Trenberth’s choices make him appear detached from his audience, though
supposedly in an elevated position of power, given the low camera angle, as displayed
in Figure 17(a)-(b). There is, to some extent, editorial bias in this portrayal of the two
men, with a Skype format for Dr Trenberth versus the professional studio setting for
Mr Ebell. However, it can be argued that both men have equal opportunity to arrange
and organize their interviews. The settings give an indication as to which interviewee
is more attuned to the significance of media appearances.


1 Relevant clause numbers have been inserted above each unit to map them back to the actual text.
Units of video annotation do not have clause numbers inserted.

2 In the visualizations, each unit is labeled with three pieces of information: the initials of the speaker
or annotation strip name, the first three alphanumeric characters of the actual text and the first three
alphanumeric characters of the annotated semiotic choice label.


Figure 17(a) Screenshot of Myron Ebell (on-screen caption: MYRON EBELL, COMPETITIVE ENTERPRISE INSTITUTE)


Figure 17(b) Screenshot of Kevin Trenberth (on-screen caption: KEVIN TRENBERTH)

The other choices which are significant for Mr Ebell are ‘Conceptual Representation:
Attributive’ and ‘Gaze and Kinetic Action Vectors: Engaged’. The same choices for
Dr Trenberth are not robust and frequent enough to be highlighted. From these
patterns,  we  may  see  how  Mr  Ebell  portrays  a  consistent  and  impactful  visual 
impression  where  he  establishes  rapport  with  his  audience  and  projects  a  credible 
image  foregrounded  against  a  background  of  Capitol  Hill  in  Washington  DC. 

These  video  annotation  units  appear  with  the  linguistic  units  of  ‘Process’  for  verbs  or 
verb  phrases.  The  ‘Process’  relates  participants  and  circumstances  in  the  clause 
(Halliday  and  Matthiessen  2004),  in  this  case  for  ‘Material’  and  the  ‘Relational’ 
process  types.  The  Material  process  concerns  an  action  or  happening,  while  the 
Relational  process  is  concerned  with  states  of  being  and  making  sense  of  the  world  by 
relating  concrete  and  abstract  concepts  to  each  other.  Mr  Ebell  and  Dr  Trenberth  both 
select  Relational  processes  most  often  (41.2%  and  46.2%  respectively).  They  both 
also  select  Material  processes  (27.9%  and  19.2%)  as  the  next  most  frequent  choice, 
with  a  higher  relative  occurrence  for  Mr  Ebell. 
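
Relative frequencies of this kind can be derived directly from the ‘Process’ annotations of each speaker. The sketch below shows the arithmetic on a hypothetical list of ten labels; the labels, the counts and the function name are illustrative and do not reproduce the interview data.

```python
from collections import Counter


def process_shares(process_labels):
    """Return {process type: percentage of all labelled processes}."""
    counts = Counter(process_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ptype: round(100.0 * n / total, 1) for ptype, n in counts.items()}


# Hypothetical 'Process' annotations for one speaker.
labels = ["Relational"] * 5 + ["Material"] * 3 + ["Mental"] * 2
print(process_shares(labels))
# {'Relational': 50.0, 'Material': 30.0, 'Mental': 20.0}
```

Because the shares are normalized by each speaker's own total, they can be compared across speakers even when the speakers contribute different numbers of clauses.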

The clauses in which Mr Ebell and Dr Trenberth select Relational and Material
processes reveal a focus on what the climate scientists are (or are not), and what they
have  done  (or  have  not  done).  However,  even  though  both  interviewees  are  focusing 
on  the  same  participants  and  their  actions,  the  resultant  effect  is  different.  For  Mr 
Ebell,  the  focus  on  the  actions  of  climate  scientists  functions  to  position  him  as  an 
accuser  who  questions  and  challenges  the  moral  standards  of  the  climate  scientists  by 
focusing  on  the  ethics  of  their  actions,  characterizing  them  as  immoral  individuals.  He 
conveniently associates Dr Trenberth with this group, though he does remark that Dr
Trenberth is “not one of the, sort of, main gang leaders” here, with surprisingly little
verbal response from Dr Trenberth himself, apart from an initial reaction of surprise
and a wry smile. Mr Ebell is then given the freedom to accuse this group, and
consequently Dr Trenberth, of intentionally giving an inaccurate picture of what is
really happening with the earth’s climate, once again with little intervention from Dr
Trenberth other than for brief moments. Only as the interviewer Jon Scott attempts to
close the interview does Dr Trenberth launch into an almost desperate attempt to insert
as much information denying Mr Ebell’s accusations as possible.

In contrast, Dr Trenberth’s strategy of focusing on the climate scientists and their
actions  puts  him  on  the  defensive,  given  the  previous  media  reports  and  online 
information  which  focus  on  the  seemingly  incriminating  evidence  from  the  emails  of 
scientists  manipulating  data,  playing  ‘tricks’  and  restricting  access  to  information. 
Thus,  the  onus  is  on  Dr  Trenberth  to  disprove  these  assertions,  rather  than  on  Mr  Ebell 
to  prove  the  correctness  of  his  assertions.  Dr  Trenberth’s  focus  on  himself  and  his 
contemporaries  puts  him  on  the  defensive  because  he  does  not  provide  any  evidence 
to  counter  Mr  Ebell’s  attempts  at  character-assassination,  other  than  to  say  the 
opposite  of  what  Mr  Ebell  is  saying,  or  worse,  to  even  admit  that  what  Mr  Ebell  is 
alleging  might  be  true,  except  that  he  himself  is  not  guilty. 

In addition, Dr Trenberth’s failure to respond adequately and forcefully to Mr Ebell,
who develops his argument in his longest utterance spread across slightly more than
forty clauses, gives Mr Ebell the dialogic space to state his case freely, and thus
allows him to dominate the discourse. Clayman and Heritage (2002) have defined the
interview genre as akin to gladiatorial combat between interviewer and interviewee.
We posit that this combat also exists between two interviewees who represent different
views on the same topic. Thus, Dr Trenberth’s reluctance or inability to wrest dialogic
space from Mr Ebell allows Mr Ebell time and opportunity to forcefully advance his
argument, which Dr Trenberth ultimately fails to counteract for two reasons. Firstly,
the  news  debate  interview  genre  assigns  overall  authority  to  an  interviewer  who  has 
most  control  over  how  the  interview  develops,  and  the  interviewer  here  does  not  give 
Dr  Trenberth  much  opportunity  to  refute  Mr  Ebell’s  arguments.  Secondly,  Dr 
Trenberth  does  not  attempt  to  attack  Mr  Ebell’s  credibility  except  when  he  says  ‘Well, 
that’s  certainly  a  shameful  comment’  and  ‘Your  charges  are  just  completely  false’.  But 
even  then,  he  either  does  not  continue  from  there  or  merely  continues  to  claim  the 
opposite  of  what  Mr  Ebell  has  said.  Mr  Ebell  renders  such  challenges  ineffective 
because  he  has  already  established  doubt  about  Dr  Trenberth’s  credibility  and  made 
explicit  what  these  climate  scientists  have  done  to  give  an  inaccurate  picture  of  the 
dangers facing the world as a result of global warming.

3.5 Mathematical Modeling of SFT Base Data


Using  mathematical  modeling,  we  see  how  the  resulting  visualizations  of  the  SFT 
multimodal  base  data  differ  for  the  two  interviewees.  By  relating  these  visualization 
patterns to the text and applying SFT frameworks, we have provided a comprehensive
account of why Mr Ebell seems to have been successful in this interview. The benefit
of  the  mathematical  modeling  techniques  is  that  the  patterns  which  emerge  correspond 
with  expectations  derived  from  SFT  multimodal  base  data,  as  corroborated  by  an 
expert  human  analyst.  Thus  the  approach  can  be  seen  as  a  meaningful  scientific 
methodology  that  employs  a  simultaneous  top-down  contextual  view  and  bottom-up 
grammatical  view  to  interpret  semantic  patterns  in  multimedia  data. 

In  summary,  our  analysis  has  shown  how  the  visual  disengagement  of  Dr  Trenberth,
together  with  his  linguistic  and  content  choices  and  his  inability  to  challenge  Mr  Ebell
effectively  when  necessary,  makes  for  a  less  impactful  and  convincing  argument  than
that  of  Mr  Ebell.  Mr  Ebell  is  visually  more  engaging  and  shows  his  understanding  of
the  news  debate  interview  genre  in  two  ways:  by  not  relinquishing  his  hold  on  the
dialogic  space  of  the  interview  unless  the  interviewer,  who  is  normally  recognized  as
the  institutional  authority  with  regard  to  how  the  interview  is  conducted  (Budd,  Craig
and  Steinman  1999),  demands  it,  and  by  using  efficiently  whatever  time  he  has  been
given  to  put  forward  his  arguments.

It  is  clearly  not  facts  and  reputation  that  help  to  win  over  an  audience  in  a  news  debate 
interview.  This  is  apparent  in  how  Dr  Trenberth,  even  with  his  knowledge  about 
climate  change  and  his  credibility  as  a  Nobel  prize-winning  scientist,  ends  up 
desperately  trying  to  regain  ground  towards  the  end  of  the  interview.  Perhaps,  because 
of  the  visual  nature  of  the  television  news  debate  interview,  the  person  takes  centre 
stage,  where  credibility  is  not  established  through  logical  argument  or  one’s  reputation, 
but  through  a  populist  yardstick  based  on  information  that  is  easily  accessed  and 
repeated  endlessly  in  a  public  domain  by  media  that  may  prioritize  one  particular 
perspective  over  another  as  a  result  of  its  own  agenda  -  news  reports  that  sell 
(Weingart,  Engels  and  Pansegrau  2000)  and  a  visual  accessibility  that  attracts 
attention  and  positive  evaluation  from  viewers.


Thus,  our  methodology  of  combining  mathematical  modeling  and  SFT  multimodal 
analysis  has  shown  two  advantages:  one,  that  there  is  a  way  to  objectively  combine 
the  overwhelmingly  numerous  linguistic  and  visual  choices  made  in  a  multimedia  text 
and  make  sense  of  these  seemingly  disparate  choices  over  the  dimension  of  time;  and 
two,  that  the  patterns  derived  via  mathematical  modeling  and  its  resultant 
visualizations  can  be  interpreted  through  theoretical  frameworks  that  imbue  these
patterns  with  meaning.  Moreover,  the  interpretations  arising  from  the  analysis  can 
help  us  understand  more  about  the  ideological  implications  of  digital  communication 
today. 


4.  Future  Work 

“To  say  we  move  in  a  new  world,  the  digital  information  age,  is  already  a  cliche.  Our 
challenge  appears  to  be  the  navigation  through  and  adaptation  to  not  so  much  an 
actual,  material  environment  but  the  virtual  semiotic,  informational  environment —  an 
environment  of  our  own  making,  incorporating  the  discourses  of  many  millions  of 
multiliterate  social  agents;  and  yet  an  evolved  rather  than  designed  environment” 
(O’Halloran  &  Smith,  in  press). 

The  project  has  reinforced  and  extended  existing  research  findings  concerning  the 
communication  of  climate  science  in  the  public  domain,  showing  how  the  media  plays 
a  powerful  role  in  influencing  how  members  of  the  public  perceive  both  scientific 
knowledge  and  the  scientific  community  itself  (Boykoff  and  Boykoff  2004;  Boykoff 
2011;  Carvalho  2007).  We  have  also  demonstrated  how  mathematical  modeling  of 
SFT  multimodal  data  can  contribute  to  our  understanding  of  how  events  are  construed 
and  reported  by  different  text  producers.  Such  a  methodology  can  be  extended  to  any 
domain  of  private  and  public  activity.  In  this  project,  we  chose  the  financial  crisis  and 
climate  change  due  to  the  global  significance  of  these  events  in  the  world  today. 


The  techniques  developed  in  this  project  point  to  the  future  for  data  analysis,  search 
and  retrieval  as  an  integral  component  of  visual  analytics  software  for  socio-cultural
modeling.  With  the  perpetuation  of  such  trends  and  the  instantiation  of  instabilities 
being  carried  through  modes  of  communication  made  easy  and  more  diverse  in  an 
increasingly  advanced  digital  age,  we  need  to  critically  examine  instances  of  human 
communication  in  its  various  multimodal  forms  to  make  sense  of  how  societies  and 
cultures  maintain  and  perpetuate  the  ideas,  beliefs,  values  and  principles  which
drive  their  very  existence.  Mathematical  modeling  of  multimodal  communication  will
enable  us  to  understand  the  increasingly  complex  and  dynamic  world  we  now  live  in,
with  a  view  to  identifying  and  tracking  evolving  semantic  patterns,  in  particular  those
related  to  stability  and  instability  in  a  rapidly  changing  world  which  is  facing  many
immediate  challenges.  Future  work  in  the  Multimodal  Analysis  Lab  will  focus  on
developing  more  sophisticated  mathematical  modeling  approaches  with  a  view  to
integrating  these  techniques  in  existing  visual  analytics  software  for  socio-cultural 
modeling. 